Practical Guide | Deep Learning Project Workflow (Summary of Andrew Ng's Talk at DLSS 2016) (with GitHub Link)

2017-03-06 Global Artificial Intelligence (全球人工智能)

Source: GitHub

Deep Learning Project Workflow

This document attempts to summarize Andrew Ng's recommended machine learning workflow from his "Nuts and Bolts of Applying Deep Learning" talk at Deep Learning Summer School 2016. Any errors or misinterpretations are my own.

Start Here

Measure Human-level performance on your task.
Do your training and test data come from the same distribution?

Measure Human-Level Performance

The real goal of measuring human-level performance is to estimate the Bayes Error Rate. Knowing your Bayes Error Rate helps you figure out if your model is underfitting or overfitting your training data. More specifically, it will let us measure 'Bias' (as Ng defines it), which we use later in the workflow.

If Your Training and Test Data Come from the Same Distribution

1. Shuffle and split your data into Train / Dev / Test sets

Ng recommends a Train / Dev / Test split of approximately 70% / 15% / 15%.
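A minimal sketch of such a split, using only the Python standard library. The function name `split_train_dev_test`, its parameters, and the fixed seed are illustrative assumptions, not part of Ng's talk.

```python
import random

def split_train_dev_test(examples, train_frac=0.70, dev_frac=0.15, seed=0):
    """Shuffle examples and split them into train / dev / test lists."""
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_dev = round(len(shuffled) * dev_frac)

    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # whatever remains, roughly 15%
    return train, dev, test

# Hypothetical dataset of 1000 example ids.
train, dev, test = split_train_dev_test(range(1000))
print(len(train), len(dev), len(test))
```

Shuffling before slicing matters here: if the data is stored in some meaningful order, unshuffled slices would no longer share a distribution.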

2. Measure your training and dev set errors, and calculate bias and variance

Calculate your bias and variance as:

Bias = (Training Set Error) - (Human Error)
Variance = (Dev Set Error) - (Training Set Error)
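The two formulas above translate directly into code. This is a small sketch; the function name and the error rates in the usage lines are hypothetical values chosen for illustration.

```python
def bias_and_variance(human_error, train_error, dev_error):
    """Compute bias and variance as defined in this workflow."""
    bias = train_error - human_error      # gap between the model and human level
    variance = dev_error - train_error    # gap between training and dev performance
    return bias, variance

# Hypothetical error rates: 1% human, 5% training, 6% dev.
bias, variance = bias_and_variance(human_error=0.01, train_error=0.05, dev_error=0.06)
print(f"bias={bias:.2f}, variance={variance:.2f}")
```

With these made-up numbers the bias (about 4 points) dominates the variance (about 1 point), so bias would be the first thing to address.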

3. Do you have high bias? Fix it first.

An example of high bias:

Fix high bias before going on to the next step.

4. Do you have high variance? Fix it.

An example of high variance:

Once you fix your high variance, you're done!

If Your Training and Test Data Do Not Come from the Same Distribution

1. Split your data

If your train and test data come from different distributions, make sure at least your dev and test sets are from the same distribution. You can do this by taking your test set and using half as dev and half as test.

Carve out a small portion of your training set (call this Train-Dev) and split your Test data into Dev and Test:
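The four-way split described above might look like the following sketch. The function name, the 5% Train-Dev fraction, and the dataset sizes are assumptions made for illustration; the text only says "a small portion".

```python
import random

def split_mismatched_data(train_pool, test_pool, train_dev_frac=0.05, seed=0):
    """Split two differently distributed pools into train / train-dev / dev / test."""
    rng = random.Random(seed)
    train_pool = list(train_pool)
    test_pool = list(test_pool)
    rng.shuffle(train_pool)
    rng.shuffle(test_pool)

    # Carve a small Train-Dev set out of the training distribution.
    n_train_dev = max(1, round(len(train_pool) * train_dev_frac))
    train_dev = train_pool[:n_train_dev]
    train = train_pool[n_train_dev:]

    # Use half of the test distribution as Dev and the other half as Test.
    half = len(test_pool) // 2
    dev = test_pool[:half]
    test = test_pool[half:]
    return train, train_dev, dev, test

# Hypothetical pools: 10,000 training-distribution ids, 400 test-distribution ids.
train, train_dev, dev, test = split_mismatched_data(range(10000), range(10000, 10400))
print(len(train), len(train_dev), len(dev), len(test))
```

Train-Dev shares a distribution with Train, while Dev and Test share the other one, which is what lets the later steps separate variance from train/test mismatch.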

2. Measure your errors and calculate the relevant metrics

Calculate these metrics to help know where to focus your efforts:
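One plausible way to compute these metrics is sketched below. It assumes each metric is the gap between adjacent error measurements, extending the bias and variance definitions used in the same-distribution case; the function name, dictionary keys, and error rates are hypothetical.

```python
def workflow_metrics(human, train, train_dev, dev, test):
    """Return the gaps between successive error measurements.

    Assumed definitions (an extension of the same-distribution formulas):
      bias                = train error     - human error
      variance            = train-dev error - train error
      train_test_mismatch = dev error       - train-dev error
      dev_overfitting     = test error      - dev error
    """
    return {
        "bias": train - human,
        "variance": train_dev - train,
        "train_test_mismatch": dev - train_dev,
        "dev_overfitting": test - dev,
    }

# Hypothetical error rates for illustration only.
metrics = workflow_metrics(human=0.01, train=0.02, train_dev=0.03, dev=0.10, test=0.11)
for name, gap in metrics.items():
    print(f"{name}: {gap:.2f}")
```

In this made-up case the largest gap is between Train-Dev and Dev, which would point to train/test mismatch rather than bias or variance.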

3. Do you have high bias? Fix it!

An example of high bias:

4. Do you have high variance? Fix it.

An example of high variance:

Fix your high variance before going on to the next step.

5. Do you have a train/test mismatch? Fix it.

An example of train/test mismatch:

Fix your train/test mismatch before going on to the next step.

6. Are you overfitting your dev set? Fix it.

An example of overfitting your dev set:

Once you fix your dev set overfitting, you're done!
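The ordering of the steps above (bias first, then variance, then train/test mismatch, then dev set overfitting) can be sketched as a simple check. The function name, the dictionary keys, and especially the 2-point threshold are hypothetical; the summary does not say what counts as "high".

```python
def next_fix(metrics, threshold=0.02):
    """Return the first problem to work on, following the workflow order.

    `metrics` maps each problem name to its error gap; `threshold` is an
    assumed cutoff for calling a gap "high".
    """
    order = ("bias", "variance", "train_test_mismatch", "dev_overfitting")
    for name in order:
        if metrics[name] > threshold:
            return name
    return None  # nothing exceeds the threshold: done

# Hypothetical gaps: variance and mismatch are both high, but variance comes first.
example = {
    "bias": 0.01,
    "variance": 0.05,
    "train_test_mismatch": 0.10,
    "dev_overfitting": 0.00,
}
print(next_fix(example))
```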

How to Fix High Bias

Ng suggests these ways for fixing a model with high bias:

Try a bigger model
Try training longer
Try a new model architecture (this can be hard)

How to Fix High Variance

Ng suggests these ways for fixing a model with high variance:

Get more data

This includes data synthesis and data augmentation

Try adding regularization
Try early stopping
Try new model architecture (this can be hard)
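As an illustration of the early stopping suggestion, here is a minimal patience-based sketch. It works on a precomputed list of per-epoch dev set errors rather than a real training loop, and the function name, the `patience` parameter, and the error values are all assumptions.

```python
def early_stopping_epoch(dev_errors, patience=3):
    """Return the index of the epoch whose model should be kept.

    Training is considered stopped once the dev error has failed to
    improve for `patience` consecutive epochs.
    """
    best_epoch = 0
    best_error = float("inf")
    for epoch, error in enumerate(dev_errors):
        if error < best_error:
            best_error = error
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch

# Hypothetical per-epoch dev set errors.
dev_errors = [0.30, 0.22, 0.18, 0.17, 0.19, 0.20, 0.21, 0.16]
stop = early_stopping_epoch(dev_errors, patience=3)
print(stop)
```

Note the trade-off in the patience value: with `patience=3` the loop stops before ever seeing the lower error at the final epoch, so a larger patience costs more training time but is less likely to stop prematurely.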

How to Fix Train/Test Mismatch

Ng suggests these ways for fixing a model with high train/test mismatch:

Try to get more data similar to your test data
Try data synthesis and data augmentation
Try new model architecture (this can be hard)

How to Fix Dev Set Overfitting

Ng suggests only one way of fixing dev set overfitting:

Get more dev data

Presumably this would include data synthesis and data augmentation as well.

Resource: https://github.com/thomasj02/DeepLearningProjectWorkflow
